A structure-based mechanism for displacement of the HEXIM adapter from 7SK small nuclear RNA

Productive transcriptional elongation of many cellular and viral mRNAs requires transcriptional factors to extract pTEFb from the 7SK snRNP by modulating the association between HEXIM and 7SK snRNA. In HIV-1, Tat binds to 7SK by displacing HEXIM. However, without the structure of the 7SK-HEXIM complex, the constraints that must be overcome for displacement remain unknown. Furthermore, while structural details of the TatNL4-3-7SK complex have been elucidated, it is unclear how subtypes with more HEXIM-like Tat sequences accomplish displacement. Here we report the structures of HEXIM, TatG, and TatFin arginine-rich motifs in complex with the apical stemloop-1 of 7SK. While most interactions of 7SK with HEXIM and Tat are similar, critical differences exist that guide function. First, the conformational plasticity of 7SK enables the formation of three different base pair configurations at a critical remodeling site, which allows for the modulation required for HEXIM binding and its subsequent displacement by Tat. Furthermore, the specific sequence variations observed in various Tat subtypes all converge on remodeling 7SK at this region. Second, we show that HEXIM primes its own displacement by causing specific local destabilization upon binding, a feature that is then exploited by Tat to bind 7SK more efficiently.

Transcription of all class II genes is a highly regulated process within cells. Shortly after promoter clearance, RNA Polymerase II is inhibited by negative elongation factors [1][2][3][4][5] . Release from this stalled state requires all components to be phosphorylated by the positive elongation factor pTEFb, a heterodimeric complex consisting of Cyclin T1 and the cyclin-dependent kinase Cdk9 [6][7][8][9][10][11][12] . However, most of the pTEFb is kept catalytically inactive in the nucleus by the 7SK small nuclear ribonucleoprotein (7SK snRNP) through its interactions with the HEXIM adapter protein and the 5' stemloop-1 of the 7SK RNA [13][14][15][16][17][18][19][20][21][22][23] (7SK-SL1). Thus, productive transcriptional elongation of many genes requires transcriptional factors to extract pTEFb from the 7SK snRNP, a process that involves manipulating the interaction between HEXIM and 7SK. This association between 7SK and HEXIM tightly controls the balance between active and inactive pTEFb, and dysregulation of this interaction can have serious biological consequences, including cardiac hypertrophy and breast and pancreatic cancers [24][25][26][27] . Furthermore, as many viruses rely on the host transcriptional machinery to produce mRNA and genomes, they have also evolved mechanisms to capture pTEFb [28][29][30] . One such unique case is the human immunodeficiency virus (HIV), which utilizes the viral Tat protein to extract pTEFb by binding to the same region of 7SK as HEXIM and directly displacing it [30][31][32][33][34] . Structural insights into the consequence of HEXIM binding to 7SK and how positive transcriptional factors like Tat compete with it are therefore important for understanding HEXIM's potency as a critical negative regulator.
Our previous work showed that 7SK-SL1 apical is enriched in arginine sandwich motifs (ASMs) 45 . ASMs are defined by two nucleotides that stack in a manner that allows for intercalation of arginine guanidinium moieties between the aromatic rings of the bases [45][46][47][48][49][50][51] . While a bulge pyrimidine forms the cap by engaging in a triple-base interaction with an n + 2 base pair in the stem, a Watson-Crick base-paired nucleotide preceding the bulge forms the base of the interaction. In the free 7SK-SL1 apical , three such bulges fold into preformed arginine sandwich motifs (ASM 1 , ASM 2 , and ASM 4 ) poised for arginine guanidinium moieties to dock into them. A fourth bulge folds into a pseudo configuration (pseudo-ASM 3 ) where U 40 can form a triple-base interaction with the A 43 -U 66 base pair to form the cap, but the base of the sandwich is sequestered in a reverse Hoogsteen interaction, excluding it from use as a classical ASM. Our work also showed that HIV-1 Tat NL4-3 (Tat NL4-3 ) uses its arginine-rich motif to intercalate arginines not only into the three preformed ASMs, but also to remodel the pseudo-ASM into a classical ASM 45 . This structural remodeling of pseudo-ASM 3 is a key mechanism through which Tat displaces HEXIM.
However, without the structure of the HEXIM:7SK-SL1 apical interaction, it is currently unclear what structural constraints Tat would need to overcome to access pTEFb. Furthermore, while the Tat ARM is highly conserved, sequence variations exist in different strains that allow for HEXIM displacement. For example, the ARM of Tat Finland (Tat Fin ; KR 52 KHRRR) differs from HEXIM (KK 151 KHRRR) by only a single amino acid and would lack one of the ASM interactions from the previously described Tat NL4-3 strain (KR 52 RQRRR). Additionally, while Tat subtype G (Tat G ; KR 52 R 53 HRRR) has an equivalent number of arginines as Tat NL4-3 , the critical linker sequence connecting the ASM 3 / ASM 4 and ASM 1 /ASM 2 interactions is the same as in HEXIM. In this study, we present the structure of the 7SK-SL1 apical in complex with the HEXIM, Tat Fin , and Tat G ARMs. Despite sequence variations, the structures show deep major groove intercalations of all ARMs, albeit with differential interactions with pseudo-ASM 3 and ASM 4 . Furthermore, we show that HEXIM causes local destabilization of ASM 4 , enhancing Tat's affinity for 7SK. These studies thus uncover a feature in which HEXIM facilitates its own displacement by increasing conformational sampling, which may be a more general mechanism of pTEFb capture.

Results
Comparative binding affinities of HEXIM and Tat to 7SK. As a first step toward identifying the comparative thermodynamic properties of 7SK recognition between HEXIM and Tat, we performed binding studies using isothermal titration calorimetry (ITC). ITC traces of the ARMs into 7SK-SL1 apical-AGU produce significant nonspecific heats of binding as previously observed 52 . In a previous study by Brillet et al., high salt conditions (0.5 M NaCl) were used to abrogate such nonspecific interactions that stem from charge-charge interactions between the positively charged peptides and the negative RNA backbone 52 . While such a strategy is commonly used, it is not ideal for this system as the structuring of ASMs in 7SK is highly sensitive to ionic conditions and folds only around physiological salt conditions (Supplementary Figs. 1, 2) 45 . Therefore, to subtract the nonspecific heats of binding, we designed a control construct that lacks all ASMs (7SK-SL1 apicalΔASM ). Indeed, the heats obtained from peptide titrations into this control construct completely accounted for the nonspecific heats, the subtraction of which allowed for experimental baselines to approach zero at saturation (Supplementary Fig. 2b). Titration of the N-terminal ARM residues of HEXIM (R 146 QLGKKKHRRR 156 ; HEXIM N-ARM ) into 7SK-SL1 apical-AGU containing an AGU triloop engineered to prevent low levels of dimerization gave a K d of 229 ± 20 nM (N = 1 ± 0.1; Fig. 1a). The redesign of the previously used GAGA tetraloop 45 to an AGU triloop was done to prevent weak association between the tyrosine in the peptide and the tetraloop. Nevertheless, while the affinities obtained with the AGU triloop are 2- to 3-fold weaker, the relative differences between HEXIM and Tat are similar (see below).
Our previous work showed that the Tat NL4-3 (KR 52 RQRRR) ARM represents the interaction domain between Tat and 7SK-SL1 apical and has an approximately two-fold increased affinity over the HEXIM N-ARM , which provides an explanation for HEXIM displacement 45 . ITC traces show that Tat Subtype G's ARM (KR 52 RHRRR), which also has two N-terminal arginines, binds 7SK-SL1 apical-AGU with a K d of 81 ± 10 nM (N = 1.1 ± 0.1; Fig. 1e), which is an approximately 2.8-fold increased binding affinity over HEXIM N-ARM (Supplementary Table 1). On the other hand, Tat Finland's ARM (KR 52 KHRRR), despite having an additional N-terminal arginine compared to the HEXIM N-ARM (R52 and K151, respectively), does not have a statistically significant increase in binding affinity over HEXIM N-ARM (K d of 172 ± 10 nM, N = 1 ± 0.02; Fig. 1f). Overall, these results highlight the need for understanding the HEXIM-bound 7SK; while the increased Tat G affinity would allow for HEXIM displacement, it is unclear how Tat Fin can achieve the same biological output.
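The relative affinities quoted above follow directly from the reported K d values. As a minimal sketch, the fold changes over HEXIM N-ARM and the corresponding binding free energies (ΔG = RT ln K d ) can be computed as below. The temperature of 298.15 K is an assumption made only for illustration, since the ITC temperature is not stated in this section, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal mol^-1 K^-1
ASSUMED_T = 298.15     # K; ASSUMPTION, the ITC temperature is not given in the text

# Dissociation constants reported in the text (nM)
KD_NM = {"HEXIM N-ARM": 229.0, "Tat G": 81.0, "Tat Fin": 172.0}


def fold_over_hexim(name):
    """Fold increase in affinity relative to HEXIM N-ARM (ratio of Kd values)."""
    return KD_NM["HEXIM N-ARM"] / KD_NM[name]


def delta_g(kd_nm, temperature=ASSUMED_T):
    """Binding free energy in kcal/mol from a Kd given in nM (1 M standard state)."""
    return R_KCAL * temperature * math.log(kd_nm * 1e-9)


for name, kd in KD_NM.items():
    print(f"{name}: Kd = {kd:.0f} nM, "
          f"fold over HEXIM = {fold_over_hexim(name):.2f}, "
          f"dG = {delta_g(kd):.2f} kcal/mol")
```

Under these assumptions the Tat G ratio comes out at roughly 2.8, matching the value quoted in the text, whereas the Tat Fin ratio is only about 1.3.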
Preformed configurations of ASM 1 and ASM 2 provide a common mode of interaction with C-terminal arginines. To understand how the HEXIM N-ARM and the various Tat ARMs interact with 7SK-SL1 apical-AGU , we utilized a combination of small-angle X-ray scattering (SAXS) and NMR. All reconstructed ab initio SAXS envelopes showed no major overall global changes between peptide-bound and free 7SK-SL1 apical-AGU (Supplementary Fig. 3). Numerous intermolecular NOEs place both HEXIM and Tat arginine-rich motifs into the major groove of the RNA and allow us to define their interactions with all ASM regions. Base pairs in the lower part of the stemloop below the G 79 -U 32 base pair, as well as the CAGUG pentaloop, do not give any intermolecular NOEs, indicating that the interactions are contained within a single turn of the helix (Fig. 2 and Table 1). In the free 7SK-SL1 apical-AGU , ASM 1 and ASM 2 are placed in tandem orientation, and upon titration of the various ARMs, all expected NOEs for such configurations are retained. Unlike a typical ASM where the following nucleotide after the bulge is in a canonical Watson-Crick base pair, in ASM 1 , the residue A 77 is configured into an A 34 -A 77 base pair. An NOE from the A 77 H8 proton to the H1′ of C 75 positions this residue under the C 75 cap (Supplementary Fig. 4). This confirms a planar orientation of C 75 with the C 33 -G 78 base pair and configures A 77 in such a way that it is perfectly positioned to interact with the guanidinium moiety of R156 in HEXIM N-ARM and R57 in Tat Fin and Tat G , which intercalate between C 75 and G 74 in a manner identical to canonical ASMs.
Similarly, in ASM 2 , the C 71 + base also retains its protonation at the N3 position, as evidenced by a downfield shift of the N4 amino protons (Supplementary Fig. 8). The guanidinium moiety of R155 in HEXIM N-ARM and R56 of Tat Fin and Tat G interact with G 73 by intercalating between the C 71 + cap and G 70 base of the motif (Supplementary Figs. 5-7). Additionally, intermolecular NOEs from the aromatic protons of the C 75 and C 71 + caps and the G 74 and G 70 bases of ASM 1 and ASM 2 to the Hγ and the Hδ protons confirm that consecutive arginines R156 and R155 interact in a ladder-like configuration with the tandem preformed motifs ASM 1 and ASM 2 , respectively (Fig. 3a and Supplementary Fig. 6c). Such NOEs are also observed in both the Tat Fin and the Tat G -bound complexes, confirming the similar placement of the C-terminal R57 and R56 into the tandem ASM 1 and ASM 2 , respectively (Fig. 3a and Supplementary Fig. 7a, b, d, e). Taken together, the structures reveal a common mode of interaction between the nonvarying C-terminal arginines and the tandem ASMs.
Rearrangement of pseudo-ASM 3 allows for HEXIM N-terminal interactions. In the free 7SK-SL1 apical-AGU , pseudo-ASM 3 and ASM 4 adopt a pseudo-symmetrical architecture where the two motifs are spatially opposed. Upon HEXIM N-ARM binding, the pseudo-ASM 3 maintains its U 40 :A 43 -U 66 triple-base interaction although the base of the sandwich, A 39 , rearranges from a reverse Hoogsteen interaction with U 68 into a cis-Hoogsteen/sugar interaction, giving rise to an alternate pseudo configuration (Fig. 3b). This is evidenced both by NOEs from the U 68 imino proton to the A 39 amino protons and NOEs from the U 68 H2′ and H3′ protons to the A 39 H8 proton (Supplementary Fig. 8b). This frees up the U 68 imino proton to engage the backbone carbonyl of K152 while simultaneously bringing the N1 proton acceptor of A 39 into the major groove to hydrogen-bond with the side chain Hε protons of K151 (Fig. 3c). Thus, both K151 and K152 can enter deep into the major groove by remodeling the pseudo-ASM 3 .
The amino side chain of K151 is within hydrogen-bonding distance of the A 39 N1 nitrogen as evidenced by NOEs from the K151 Hγ and Hβ protons to the C 37 H6 and H5 protons, respectively, and from the K151 Hε protons to the C 38 H6 and H5 protons ( Fig. 3c and Supplementary Fig. 6f). Additionally, NOEs between the K152 Hβ protons with the U 68 H5 proton, the K152 Hδ protons with the C 67 and U 66 H5 protons, and the K152 Hε protons with the C 67 H5 and H6 protons position the amino side chain of K152 within hydrogen-bonding distance of the C 67 backbone ( Fig. 3c and Supplementary Fig. 6e, f). This gives rise to a forked configuration of the two lysines, orienting the side chain amino groups towards the phosphate backbones on opposite ends of the groove.
Unlike the other three ASMs, where the NOEs clearly define a single predominant structural configuration as described above, multiple dynamic states exist for ASM 4 (see below). In the most abundant form, the preformed nature found in the free state is retained as evidenced by a direct imino-to-imino connectivity between U 44 and U 63 along with maintenance of the G 46 -C 62 Watson-Crick base pair (Supplementary Fig. 8c). In fact, this interaction is stabilized by K150, which displays NOEs between the Hε protons with the U 63 and the U 40 H5 protons, positioning the amino side chain within hydrogen-bonding distance of the O4 atoms of both U 63 and U 40 (Supplementary Fig. 6d). Additional intermolecular interactions between the U 40 H5 proton and the U 63 H5 and H1′ protons with the K150 Hδ, Hγ, and Hβ protons place K150 directly under the U 63 cap of ASM 4 (Supplementary Fig. 6d, e). Taken together, these data show that despite the lack of arginines, the lysine-rich N-terminus of HEXIM N-ARM can be accommodated by 7SK: the Watson-Crick face of A 39 turns from the minor into the major groove to interact with K151 and K152, which then positions K150 to interact with the oxygen-rich environment of the U 63 and U 40 caps.
Conformational plasticity of the ASM 3 /ASM 4 region provides differential mode of interactions with N-terminal and spacer residues. Our previous study showed that Tat NL4-3 displaces HEXIM by remodeling the pseudo-ASM 3 into a canonical ASM 3 to allow for arginine intercalation 45 . Furthermore, an additional arginine docks into the preformed ASM 4 . While the mechanism of remodeling pseudo-ASM 3 is conserved upon binding of both Tat Fin and Tat G ARMs (Fig. 3b, f and Supplementary Fig. 6), both the drivers of the conformational switch and the engagement of the ASM 4 vary depending on differences in amino acid sequences.
While Tat Fin has two major differences from Tat NL4-3 (K53 to R53 and spacer H54 to Q54, respectively), it only differs by a single amino acid from HEXIM (R52 and K151, respectively). Like Tat NL4-3 , R52 is responsible for remodeling pseudo-ASM 3 ( Fig. 3c and Supplementary Fig. 7a). However, while R53 in Tat NL4-3 flips over R52 and engages ASM 4 , the equivalent K53 stays in the spacer region between the ASM 1 /ASM 2 and ASM 3 /ASM 4 regions in a manner similar to HEXIM as evidenced by NOEs between the K53 Hβ protons with the U 68 H5 proton, the K53 Hδ protons with the C 67 and U 66 H5 protons, and the K53 Hε protons with the C 67 H5 and H6 protons, which position the amino side chain of K53 within hydrogen-bonding distance of the C 67 backbone ( Fig. 3c and Supplementary Fig. 7a).
As for HEXIM, ASM 4 remains unoccupied upon binding Tat Fin and the structure shows that the K51 amino side chain is positioned to hydrogen-bond with the U 63 ribose ring in a stabilizing interaction (Fig. 3c). This is evidenced by NOEs of the K51 Hδ protons with the U 63 H5 and H1′ protons and the K51 Hε protons with the U 63 2′ hydroxyl proton (Supplementary Fig. 7c). Furthermore, the N-terminal K50 exits near the apical loop: NOEs of the K50 Hδ and Hε protons with the C 38 and C 37 H5 and H1′ protons position the amino side chain of K50 near the C 38 phosphate backbone (Fig. 3c and Supplementary Fig. 7a).
Finally, in evaluating the structural consequences of the spacer substitution, we see that H54 and R55 in Tat Fin remain near ASM 1 and ASM 2 , similar to what is found in HEXIM. This is evidenced by NOEs of the H54 (H153 in HEXIM) Hβ protons with the C 35 , C 36 , and C 37 H5 protons, placing H54 near ASM 2 , whereas the R55 (R154 in HEXIM) Hδ protons display NOEs with the A 34 H1′ proton and the C 33 H1', H5, and H6 protons, positioning this spacer residue near ASM 1 (Fig. 3d and Supplementary Figs. 6, 7). This is in contrast with the binding mode of Tat NL4-3 in which the intercalation of R53 into ASM 4 drags both the Q54 and R55 spacer residues towards the apical ASMs.
The importance of the histidine H54 spacer is even more evident in the Tat G strain where it represents the only difference from Tat NL4-3 . This single difference changes the identity of the arginine that remodels pseudo-ASM 3 . In this ARM, the positioning of H54 near ASM 2 precludes R53 from reaching ASM 4 to accomplish the inverse intercalation seen in NL4-3 ( Fig. 3d and Supplementary Fig. 7d, e). The interactions with the apical ASMs thus occur in a ladder-like manner where R53 intercalates into the remodeled ASM 3 whereas R52 intercalates into ASM 4 (Fig. 3c, d and Supplementary Fig. 7a, d, e). K51 makes the final stabilizing interaction with NOEs seen between the Hε protons and the U 63 2′ hydroxyl proton, indicating a hydrogen-bonding interaction between the K51 amino side chain and the U 63 ribose ring ( Fig. 3d and Supplementary Fig. 7f).
Taken together, these studies show that arginine sandwich motifs provide mini domains that arginine-rich motifs of proteins can differentially interact with to achieve deep major groove binding into the stem of 7SK-SL1 apical-AGU .
HEXIM allows for increased conformational sampling of apical ASMs. While titration of all four arginine-rich motifs stabilizes the majority of 7SK-SL1 apical-AGU into one predominant configuration, the HEXIM ARM is an outlier wherein binding causes ASM 1 and ASM 4 to become destabilized and exhibit multiple conformations (Fig. 4a, b and Supplementary Fig. 4). In such conformations, the NOEs between the imino protons of U 63 and U 44 disappear, indicating the disruption of the U 63 :U 44 -A 65 triple and loss of ASM 4 ( Supplementary Fig. 8c). The destabilization of this region is also indicated by the line-broadening of K150, which interacts with U 63 in the folded configuration (Supplementary Fig. 6e).
To evaluate the implication of HEXIM's ability to locally destabilize ASM 1 and ASM 4 in the context of its displacement required for transcriptional regulation, we compared Tat Fin and Tat G ARM binding to 7SK-SL1 apical both free and in the presence of HEXIM N-ARM . Due to the modest differences in binding energetics between the different ARMs, competition experiments using ITC were not tractable. A 1:1 titration of both Tat G and Tat Fin into 7SK in the NMR shows the ability to completely engage ASM 2 and ASM 3 , while a significant fraction of ASM 1 and ASM 4 shows the presence of free configurations, indicating reduced access for the termini. However, upon titration of both Tats into the HEXIM-bound 7SK complex, we observe not only complete engagement of all ASMs but also a total displacement of HEXIM (Fig. 4a, b). This is especially striking given that the binding affinities of Tat Fin and HEXIM for free 7SK are equivalent. Taken together, these data indicate that Tat can better engage 7SK that is destabilized by HEXIM at the outer ASMs. Finally, ITC data of full-length HEXIM bound to full-length 7SK snRNA (N = 1.9 ± 0.1; Fig. 4d) show that an entropy-driven interaction is maintained and, in fact, is even more pronounced (−TΔS = −6.4 ± 1.3 kcal mol −1 , ΔH = −2.6 ± 1 kcal mol −1 ; Fig. 4c), suggesting that HEXIM binding may globally increase the conformational space sampled by the 7SK snRNP complex. These studies suggest that destabilization by HEXIM may play an important role in how transcription factors access 7SK for pTEFb capture.
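The description of the full-length interaction as entropy-driven can be checked with simple arithmetic on the reported terms, using ΔG = ΔH + (−TΔS). The sketch below is illustrative only: combining the two uncertainties in quadrature assumes they are independent, which the text does not state, and the variable names are hypothetical.

```python
import math

# Thermodynamic terms reported in the text for full-length HEXIM binding
# to full-length 7SK snRNA (kcal/mol)
DELTA_H, DELTA_H_ERR = -2.6, 1.0
MINUS_T_DELTA_S, MINUS_T_DELTA_S_ERR = -6.4, 1.3

# Gibbs free energy: dG = dH + (-T*dS)
delta_g = DELTA_H + MINUS_T_DELTA_S

# ASSUMPTION: the two uncertainties are independent, so they add in quadrature
delta_g_err = math.sqrt(DELTA_H_ERR ** 2 + MINUS_T_DELTA_S_ERR ** 2)

# Share of the total free energy contributed by the entropic term
entropic_fraction = MINUS_T_DELTA_S / delta_g

print(f"dG = {delta_g:.1f} +/- {delta_g_err:.1f} kcal/mol")
print(f"entropic share of dG = {entropic_fraction:.0%}")
```

With these inputs the entropic term accounts for roughly 70% of the total free energy.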

Discussion
The 7SK snRNP represents a central biomolecule that a wide range of transcriptional factors needs to interact with to access pTEFb to control transcriptional elongation. In particular, pTEFb extraction by HIV Tat from this complex requires manipulating the interaction between the 7SK snRNA and the HEXIM adapter protein.
In this study, we solved the structures of the RNA binding domains of HEXIM and Tat bound to 7SK and gained several insights into their functional significance. The structures show that both HEXIM and Tat directly bind the stem of 7SK-SL1 apical through intercalation of arginine-rich motifs into an entire helical turn of the major groove. This is unusual as RNA major grooves are deep and narrow, making them generally inaccessible for protein binding. The architecture of the four sandwich motifs in 7SK allows for transcriptional regulators to differentially utilize their ARMs. On the one hand, the tandem preformed ASMs, ASM 1 and ASM 2 , remain unchanged from their free configuration upon encountering the C-terminal arginines of Tat and HEXIM. On the other hand, the apical pseudo-symmetrical ASMs, pseudo-ASM 3 and ASM 4 , reconfigure depending on their binding partners. The structures show that the ASM 3 region can adopt at least three different base pair interactions: a reverse Hoogsteen in the free state, a cis-Hoogsteen/sugar interaction upon HEXIM binding, and a Watson-Crick base pair upon Tat binding. The cis-Hoogsteen/sugar interaction is especially significant because it allows HEXIM to enter the major groove despite the lack of arginines in the N-terminus. Similarly, while ASM 4 retains its preformed configuration found in the free state upon Tat binding, it can be destabilized in the presence of HEXIM and adopt multiple flexible states. Taken together, these studies show that 7SK is adaptable in its ASM architecture, which can be modulated upon encountering different transcription factors.
Comparative analyses of HEXIM and Tat provide insights into how both positive and negative regulators can manipulate 7SK to carry out their transcription roles. Our studies implicate HEXIM as potentially having a dual structural role. On the one hand, it can bind with high affinity to the apical portion of 7SK-stemloop-1, and on the other hand, it simultaneously causes local destabilization of this region, enhancing the binding of a positive regulator such as Tat. In comparison to Tat, the thermodynamic profile and solution-state characteristics of HEXIM binding show an entropy-driven mode of interaction that is particularly attributed to the destabilization of ASM 1 and ASM 4 regions, indicating a mechanism in line with conformational selection. Indeed, mutational studies have shown that deletion of U 63 significantly reduces HEXIM binding 37,43 . This expansion in the dynamic state of 7SK surrounding the ASM 1 and ASM 4 region is also supported by in vivo SHAPE analysis, where U 63 and C 75 become ultra-reactive upon HEXIM:pTEFb binding 53 . Such increased conformational sampling was also demonstrated by structural and molecular dynamics modeling 45,[52][53][54][55][56][57] . Furthermore, we show that Tat capitalizes on this increased dynamic state, binding to more motifs with greater affinity to the HEXIM-bound complex than to free 7SK. While the use of a HEXIM-displacement mechanism for pTEFb capture by binding to 7SK-SL1 has yet to be discovered for cellular factors, the destabilization-driven preparation of 7SK snRNP may potentially be a general feature exploited by specialized transcriptional factors.
Comparative analysis of HEXIM and Tat also sheds light on the sequence requirements of ARMs for 7SK binding. While N-terminal lysines of HEXIM allow for destabilization of ASM 4 , the anchoring required to enter the major groove can only be provided by the stacking of C-terminal arginines within ASM 1 and ASM 2 . Indeed, the importance of these C-terminal arginines for HEXIM binding is supported by their nearly complete conservation across metazoan species 58 . Conversely, the equivalent arginines in HIV-1 Tat occur as a consecutive pair only in ~50% of reported strains, albeit with the strong requirement of at least one arginine. The structures show that these variations may be possible due to the anchoring provided by the arginines that intercalate into the apical ASMs. Nevertheless, when two arginines are present in Tat, the interactions with the tandem ASMs mirror HEXIM. Furthermore, differences in N-terminal and spacer ARM residues orchestrate the structural modulations of the apical ASMs. To accommodate the continuation of the HEXIM ARM chain from the ASM 1 /ASM 2 to the ASM 3 /ASM 4 region required for U 63 destabilization, K152 induces the reconfiguration of pseudo-ASM 3 from a reverse Hoogsteen to a cis-Hoogsteen/sugar interaction. In all variations of N-terminal Tat sequences studied, binding is concomitant with the rearrangement of pseudo-ASM 3 into a canonical ASM 3 through the intercalation of an arginine.
The structures also provide insights into specific sequence variations that occur in the highly conserved Tat ARM to displace HEXIM. When two arginines are available in the N-terminal residues, both are involved in arginine sandwich interactions, providing a twofold increase in affinity; however, either R52 (Tat NL4-3 ) or R53 (Tat G ) can act as the remodeler. This can be explained by the presence of either glutamine or histidine spacer, respectively, which is the only amino acid difference between the two strains. As glutamine (75%) and histidine (15%) make up most of the sequence variation in this spacer, the structures show that these two spacer residues drive the differential positioning of the arginine remodeler. In the Tat Fin strain, which has a histidine spacer, it is the R52 that acts as a remodeler. In this case, the R53K substitution provides the stabilizing interactions to reposition the single R52 arginine near pseudo-ASM 3 .
It is also interesting to compare the mode of binding of Tat Fin to HEXIM. First, the single residue difference (R52 vs K151) provides Tat Fin with the additional ASM intercalation required for displacement. Thus, Tat has evolved specific sequence variations that allow for the reconfiguration of pseudo-ASM 3 , even in cases where there is only a single variation from HEXIM. Second, despite both ARMs having lysines positioned near ASM 4 , only HEXIM leads to local destabilization. Our studies, therefore, provide HEXIM as an example of a negative regulator that primes its own displacement by locally destabilizing 7SK. Overall, these studies have broader implications for 7SK snRNP-mediated regulation. Given that the destabilization-driven displacement is a robust mechanism, it is possible that other yet-to-be-identified cellular and viral transcriptional regulators recruit pTEFb through direct intercalation of ARMs into 7SK-SL1 apical . Furthermore, as a destabilized state of 7SK snRNP is what is presented to all transcriptional regulators, the mechanisms necessary to extract pTEFb may converge on capitalizing on this conformational heterogeneity.
Methods
RNA sample preparation. RNA samples used for biophysical experiments were synthesized by in vitro transcription using T7 RNA polymerase with either plasmid DNA or 2′-O-methylated synthetic DNA templates (Integrated DNA Technologies) containing the T7 promoter and the desired sequences. Plasmid DNA for 7SK-SL1 Full-WT and 7SK-SL1 Full-AGU containing the T7 promoter, insert, and SmaI recognition sequence were cloned by Genscript in between the EcoRI and BamHI restriction sites of a pUC19 vector. Plasmid DNA was prepared for in vitro transcription from a 5 mL overnight culture of NEB 5α Competent E. coli (C29871) transformed with the plasmid using Qiaprep Spin Miniprep Kit (Qiagen 27104). 10 μL of purified DNA were combined with 25 μL of 2′-O-methylated reverse primer at 100 μM (5′-mGmGAGCGGTGAGGGAGGAAG-3′ where m indicates 2′ O-methylated nucleotides), 25 μL of forward primer at 100 μM (5′-GACAAGCCCGTCAGGG-3′), 2.44 mL of water, and two tubes of EconoTaq PLUS 2X Master Mix (Lucigen 30035-2). The 5 mL mixture was then aliquoted into 50 μL increments in a 96-well PCR plate and the templates for in vitro transcription reactions were amplified using the following PCR protocol: 95°C for 5 min, 34 cycles of (95°C for 30 s, 50°C for 1 min, and 68°C for 90 s), and 68°C for 5 min. After PCR amplification, reactions were pooled into 5 mL volume in a 50 mL Falcon tube and 0.5 mL of 3 M sodium acetate, pH 5 and 32 mL of 100% ethanol were added to the mixture and chilled at −80°C for at least 30 min before spinning down at 9000×g for 10 min at 4°C. The ethanol was decanted, and the pellet was left to dry overnight before in vitro transcription use.
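The volumes in the PCR and precipitation steps above can be cross-checked with a short calculation. This is a sketch based only on the numbers given in the text: the master mix volume is inferred as the remainder of the stated 5 mL total rather than taken from a product sheet, and the cycling time ignores thermal ramping.

```python
# Component volumes stated in the text (microlitres)
components_ul = {
    "purified plasmid DNA": 10,
    "reverse primer (100 uM)": 25,
    "forward primer (100 uM)": 25,
    "water": 2440,
}
TOTAL_UL = 5000          # stated 5 mL final mixture
PRIMER_STOCK_UM = 100
ALIQUOT_UL = 50

non_master_mix_ul = sum(components_ul.values())
# ASSUMPTION: the rest of the 5 mL is the 2X master mix (two tubes)
master_mix_ul = TOTAL_UL - non_master_mix_ul
master_mix_dilution = TOTAL_UL / master_mix_ul       # 2X stock diluted to 1X

final_primer_um = PRIMER_STOCK_UM * components_ul["forward primer (100 uM)"] / TOTAL_UL
nominal_aliquots = TOTAL_UL // ALIQUOT_UL

# Programmed hold times only (seconds); ramping between steps is ignored
cycling_s = 5 * 60 + 34 * (30 + 60 + 90) + 5 * 60

# Ethanol precipitation: 5 mL pooled PCR + 0.5 mL 3 M sodium acetate + 32 mL ethanol
precip_total_ml = 5 + 0.5 + 32
ethanol_fraction = 32 / precip_total_ml

print(f"master mix: {master_mix_ul} uL (dilution factor {master_mix_dilution:.1f})")
print(f"final primer concentration: {final_primer_um} uM each")
print(f"nominal 50 uL aliquots: {nominal_aliquots}")
print(f"cycling hold time: {cycling_s / 60:.0f} min")
print(f"ethanol fraction during precipitation: {ethanol_fraction:.0%}")
```

The inferred master mix volume equals half of the final volume, consistent with a 2X stock used at 1X.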
Template preparation for 7SK-SL1 apical-AGU using 2′-O-methylated reverse primers in order to suppress the heterogeneity at the 3′ end of the transcripts involved combining 15 mL of both forward (5′-TAATACGACTCACTATAGGGATCTGTCACCCCATTGATCGCCAGTGGCTGATCTGGCTGGCTAGGCGGGTCCC-3′) and reverse (5′-mGmGGACCCGCCTAGCCAGCCAGATCAGCCACTGGCGATCAATGGGGTGACAGATCCCTATAGTGAGTCGTATTA-3′ where m indicates 2′ O-methylated nucleotide) primers at 1 mM stock solution with 470 mL of water 59 . The mixture was heated at 95°C for 5 min and cooled at room temperature for 30 min before assembling the in vitro transcription reaction. Samples were either unlabeled or residue-specifically labeled with 13 C/ 15 N- or 2 H (Cambridge Isotope Laboratories, Inc.). After transcription, RNA samples were heat denatured and purified by using urea-denaturing polyacrylamide gels. The same in vitro transcription reaction protocol was done for 7SK-SL1 apicalΔASM using a forward (5′-TAATACGACTCACTATAGGGATCTGTCACCCCAGATCGCCAGTGGCGATCTGGGGAGGCGGGTCCC-3′) and reverse (5′-mGmGGACCCGCCTCCCCAGATCGCCACTGGCGATCTGGGGTGACAGATCCCTATAGTGAGTCGTATTA-3′ where m indicates 2′ O-methylated nucleotide) primer.
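Because the two 7SK-SL1 apical-AGU oligonucleotides are annealed to form a double-stranded template, the reverse primer should be the exact reverse complement of the forward one. The sketch below checks this for the sequences listed above and derives the expected transcript. It treats the 17 nucleotides TAATACGACTCACTATA as the untranscribed part of the T7 promoter, so that transcription starts at the first G; this is the commonly used convention and is an assumption here. The 2′-O-methyl marks on the first two reverse-primer nucleotides are ignored for the sequence comparison.

```python
T7_PROMOTER = "TAATACGACTCACTATA"  # ASSUMED untranscribed promoter portion

# 7SK-SL1 apical-AGU primers as listed in the text (spaces removed)
FORWARD = (
    "TAATACGACTCACTA"
    "TAGGGATCTGTCACCCCATTGATCGCCAGTGGCTGATCTGGCTGGCT"
    "AGGCGGGTCCC"
)
REVERSE = (
    "GGGACCCGCCTAGCCAGCCAG"
    "ATCAGCCACTGGC"
    "GATCAATGGGGTGACAGATCCCTATAGTGAGTCG"
    "TATTA"
)

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def expected_transcript(forward):
    """RNA expected from the coding strand downstream of the promoter."""
    if not forward.startswith(T7_PROMOTER):
        raise ValueError("forward primer does not begin with the T7 promoter")
    return forward[len(T7_PROMOTER):].replace("T", "U")


transcript = expected_transcript(FORWARD)
print("primers complementary:", reverse_complement(FORWARD) == REVERSE)
print("transcript length:", len(transcript))
print("transcript:", transcript)
```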
HEXIM ARM and Tat ARM peptide preparation. Unlabeled HEXIM N-ARM (GISYGRQLGKKKHRRRAHQ), Tat Fin ARM (GISYGRKKRKHRRRAHQ), and Tat G ARM (GISYGRKKRRHRRRAHQ) peptides were purchased from Tufts University Core Facility at a 0.1 mmol scale. Tat adapters were placed around the HEXIM N-ARM sequence to prevent non-physiological aggregation in solution-state NMR studies. HEXIM N-ARM peptides containing selective 13 C/ 15 N-labeled residues, underlined, (GISYGRQLGKKKHRRRAHQ and GISYGRQLGKKKHRRRAHQ) were purchased from New England Peptide.
Full-length HEXIM1 preparation. Synthetic DNA encoding HEXIM1 (2-359) was cloned into a bacterial pMCSG7 expression vector 59 encoding an N-terminal tobacco etch virus (TEV) protease-cleavable His 6 tag and was expressed in E. coli BL21 AI cells in an overnight culture at 20°C. Cells were lysed by sonication in buffer containing 50 mM Tris pH 8.0, 500 mM NaCl, 0.1% β-mercaptoethanol, 50 mM (NH 4 ) 2 SO 4 and protease inhibitor aprotinin and leupeptin. His 6 -HEXIM1 was purified from the cleared cell lysate using Ni-NTA resin (Qiagen) and the His 6 tag was cleaved with TEV protease. The HEXIM was run over a second Ni-NTA column, followed by anion exchange on a 5 mL HiTrap Q HP column (Cytiva) and gel filtration on a Superdex 200 16/60 column (Cytiva) in a final buffer containing 25 mM HEPES pH 7.5, 200 mM NaCl, 5% glycerol, 1 mM TCEP. HEXIM was flash frozen in liquid nitrogen and stored at −80°C.
Small-angle X-ray scattering. SAXS data for the 7SK-SL1 apical-AGU :HEXIM N-ARM , 7SK-SL1 apical-AGU : Tat Fin ARM, and 7SK-SL1 apical-AGU :Tat G ARM complexes were obtained at SIBYLS beamline of Advanced Light Source at Lawrence Berkeley National Laboratory. Measurements were performed in a buffer containing 10 mM sodium phosphate, 70 mM NaCl, 0.1 mM EDTA, pH 5.2, and the background scattering was subtracted from the sample scattering to obtain the scattering intensity from the solute molecules. Data from three different concentrations (50, 75, and 100 μM) were compared with scattering intensities at q = 0 Å −1 [I(0)], as determined by Guinier analysis, to detect possible interparticle interactions. Data were analyzed by using ScÅtter software, and the presented DAMAVER envelope structures were reconstructed by using DAMMIF/DAMMIN software from 23 independent DAMMIF runs. Chi-squared values of SAXS profiles were analyzed on FoXS 60,61 .
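The Guinier analysis mentioned above relies on the low-angle approximation ln I(q) = ln I(0) − (R g ² / 3) q², so a straight-line fit of ln I against q² yields I(0) from the intercept and the radius of gyration from the slope. The following is a minimal sketch of that fit on synthetic data; it is not the ScÅtter implementation, and the function name, the test values of R g and I(0), and the q range are all hypothetical.

```python
import math


def guinier_fit(q, intensity):
    """Least-squares fit of ln I versus q^2; returns (Rg, I0).

    Only meaningful in the Guinier region (roughly q * Rg < 1.3).
    """
    x = [qi ** 2 for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative slope: data are not Guinier-like")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic, noise-free example with HYPOTHETICAL values (Rg = 20 A, I0 = 1.5)
TRUE_RG, TRUE_I0 = 20.0, 1.5
q_values = [0.010 + 0.005 * i for i in range(11)]   # 0.010 to 0.060 A^-1
i_values = [TRUE_I0 * math.exp(-(qi * TRUE_RG) ** 2 / 3.0) for qi in q_values]

rg, i0 = guinier_fit(q_values, i_values)
print(f"Rg = {rg:.2f} A, I(0) = {i0:.3f}")
```

In a concentration series such as the one described, an I(0) that scales linearly with concentration and an R g that stays constant would argue against interparticle interactions.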
NMR data acquisition, resonance assignment, and structural calculations. For NMR experiments, the Tat ARM/HEXIM N-ARM :7SK-SL1 apical-AGU complexes were dissolved in a buffer containing 10 mM potassium phosphate, 70 mM NaCl, and 0.1 mM EDTA, pH 5.2, whereas the full-length HEXIM1:7SK-SL1 apical complex was in a buffer with 25 mM HEPES pH 7.5, 200 mM NaCl, 5% 2 H-glycerol, and 1 mM TCEP. All NMR experiments were acquired by using Bruker 700 or 800 MHz instruments equipped with cryogenic probes. Spectra for observing nonexchangeable protons were collected at 298 K in 99.96% D 2 O, whereas those for exchangeable protons were at 283 K and 298 K in 10% D 2 O. For NOESY experiments, mixing times were set to 200 ms. To help unambiguously assign the intermolecular NOEs of the HEXIM N-ARM with 7SK-SL1 apical-AGU , we used both specifically protonated GA, AC, and GU samples of 7SK-SL1 apical-AGU and two HEXIM N-ARM peptides synthesized with different combinations of 13 C/ 15 N-labeled amino acids. Samples of the 7SK-SL1 apical-AGU :HEXIM N-ARM , the 7SK-SL1 apical-AGU :Tat Fin ARM, and 7SK-SL1 apical-AGU :Tat G ARM were prepared at 1:0.9 equivalents, whereas the full-length HEXIM1:7SK-SL1 apical-AGU complex was prepared at 1:0.3 equivalents to avoid any nonspecific binding or aggregation of the protein to the RNA. Assignments for non-exchangeable 1 H, 13 C, 15 N signals of 7SK-SL1 apical-AGU in complex with HEXIM N-ARM and Tat ARMs were obtained by analyzing two-dimensional 1 H-1 H NOESY recorded with non-labeled samples and two-dimensional 13 C-HMQC and 15 N-HSQC and three-dimensional 13 C-edited HMQC-NOESY spectra for labeled samples.
The CYANA structure with the lowest target function was used as the initial model for structure calculations in Xplor-NIH to incorporate electrostatic constraints. First, structures were calculated using annealing from 2000°C to 25°C in steps of 12.5°C. Standard energy potential terms for bonds, angles, torsion angles, van der Waals interactions, and interatomic repulsions were included. The statistical backbone H-bond potential was utilized for protein residues. Energy potentials for NOEs, hydrogen bonds, and planarity were incorporated with restraints derived from NMR data. All restraints used in CYANA were included except for phosphate-phosphate distances. The structures were sorted by energy using bond, angle, dihedral, and NOE energy potential terms, and the ten percent of the structures with the lowest sort energy were further minimized with SAXS terms to incorporate orientation restraints. For this step, annealing ran from 1500°C to 25°C in steps of 12.5°C. The lowest ten percent of these were deposited in the RCSB databank.
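The two cooling schedules described above are linear ramps, so the number of temperature stages follows from the start, end, and step values. The sketch below simply enumerates them; it is plain Python for illustration, does not use or reproduce the Xplor-NIH scripting interface, and the function name is hypothetical.

```python
def annealing_schedule(start, end, step):
    """List the temperatures of a linear cooling ramp, inclusive of both ends."""
    span = start - end
    n_steps = round(span / step)
    if n_steps < 0 or abs(n_steps * step - span) > 1e-9:
        raise ValueError("step does not divide the temperature range evenly")
    return [start - i * step for i in range(n_steps + 1)]


# Values taken from the text: initial calculation and SAXS refinement stage
initial = annealing_schedule(2000.0, 25.0, 12.5)
refinement = annealing_schedule(1500.0, 25.0, 12.5)

print(f"initial stage: {len(initial)} temperatures, {len(initial) - 1} cooling steps")
print(f"refinement stage: {len(refinement)} temperatures, {len(refinement) - 1} cooling steps")
```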
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.